1 Administrivia

Two additional resources on approximating the permanent 

• Jerrum and Sinclair's original paper on the algorithm 

• An excerpt from Motwani and Raghavan's Randomized Algorithms 

2 Review of Monte Carlo Methods 

We have some (usually exponentially large) set $V$ of size $Z$, and we wish to know how many elements are
contained in some subset $S$ (which represents elements with some property we are interested in counting).
A Monte Carlo method for approximating the size of $S$ is to pick $k$ elements uniformly at random from $V$
and see how many are also contained in $S$. If $q$ elements are contained in $S$, then return as our approximate
solution $Zq/k$. In expectation, this is the correct answer, but how tightly the estimate is concentrated around
the correct value depends on the size of $p = |S|/|V|$.
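
The estimator described above can be sketched in a few lines of Python. This is only an illustration: the function name `monte_carlo_count`, its parameters, and the toy universe (integers below 1000, with $S$ the multiples of 4) are assumptions made for the example, not part of the lecture.

```python
import random

def monte_carlo_count(universe_size, sample, in_subset, k, rng):
    """Estimate |S| as Z * q / k, where q of k uniform samples land in S."""
    q = sum(1 for _ in range(k) if in_subset(sample(rng)))
    return universe_size * q / k

# Toy example (hypothetical): V = {0, ..., 999}, S = multiples of 4, so |S| = 250.
rng = random.Random(0)
estimate = monte_carlo_count(
    1000,
    lambda r: r.randrange(1000),   # uniform sample from V
    lambda x: x % 4 == 0,          # membership test for S
    20000,
    rng,
)
```

Here $p = 1/4$ is large, so a modest number of samples already gives a tight estimate; the difficulty discussed next arises when $p$ is tiny.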

Definition 1 An $(\epsilon, \delta)$ approximation scheme is an algorithm for finding an approximation within a
multiplicative factor of $1 \pm \epsilon$ with probability $1 - \delta$.

Using the Chernoff bound, if we sample independently from a 0-1 random variable, we need to conduct

$$N \ge \Theta\left(\frac{\log(1/\delta)}{\epsilon^2 p}\right)$$

trials to achieve an $(\epsilon, \delta)$ approximation, where $p = |S|/|V|$ as before. This bound motivates the definition of a
polynomial-time approximation scheme.

Definition 2 A fully-polynomial randomized approximation scheme, or FPRAS, is an $(\epsilon, \delta)$
approximation scheme with a runtime that is polynomial in $n$, $1/\epsilon$, and $\log 1/\delta$.

There are two main problems we might encounter when trying to design an FPRAS for a difficult
problem. First, $S$ may be an exponentially small subset of $V$. In this case, it would take exponentially
many samples from $V$ to get $O\left(\frac{\log(1/\delta)}{\epsilon^2}\right)$ successes. Second, it could be difficult to sample uniformly from a
large and complicated set $V$. We will see ways to solve both these problems in two examples today.
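
To make the first problem concrete, the sketch below evaluates one standard form of the multiplicative Chernoff bound, $N \ge 3\ln(2/\delta)/(\epsilon^2 p)$. The constant 3 and the function name `chernoff_samples` are assumptions of this sketch; the lecture only states the bound up to constant factors.

```python
import math

def chernoff_samples(eps, delta, p):
    """Number of trials sufficient for an (eps, delta) approximation,
    using the form N >= 3 ln(2/delta) / (eps^2 p) (assumed constant)."""
    return math.ceil(3 * math.log(2 / delta) / (eps ** 2 * p))

# The required sample count scales like 1/p, so it blows up when S is
# an exponentially small fraction of V (here p = 2^-n).
for n in (1, 10, 30):
    print(n, chernoff_samples(0.1, 0.05, 2.0 ** -n))
```

For fixed $\epsilon$ and $\delta$ the count doubles each time $n$ grows by one, which is why naive sampling fails for exponentially small targets.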



3 DNF Counting and an Exponentially Small Target 

Suppose we have $n$ boolean variables $x_1, x_2, \ldots, x_n$. A literal is an $x_i$ or its negation.

Definition 3 A formula $F$ is in disjunctive normal form if it is a disjunction (OR) of conjunctive
(AND) clauses:

$$F = C_1 \vee C_2 \vee \cdots \vee C_m,$$

where each $C_i$ is a clause containing the AND of some of the literals. For example, the formula
$F = (x_1 \wedge x_3) \vee (x_2) \vee (x_2 \wedge x_3 \wedge x_4)$ is in disjunctive normal form.

If there are $n$ boolean variables, then there are $2^n$ possible assignments. Of these $2^n$ assignments, we want
to know how many satisfy a given DNF formula $F$. Unfortunately, computing the exact number of
solutions to a given DNF formula is #P-hard.¹ Therefore we simply wish to give an $\epsilon$-approximation for the
number of solutions of a given DNF formula that succeeds with probability $1 - \delta$ and runs in time polynomial
in $n$, $m$, $\log(1/\delta)$, and $1/\epsilon$.

Naively, one could simply try to use the Monte Carlo method outlined above to approximate the number
of solutions. However, the fraction of satisfying assignments might be exponentially small, requiring
exponentially many samples to get a tight bound. For example, the DNF formula $F = (x_1 \wedge x_2 \wedge \cdots \wedge x_n)$ has
only 1 solution out of $2^n$ assignments.



3.1 Reducing the Sample Space 

Instead of picking assignments uniformly at random and testing each clause, we will sample from the
set of assignments that satisfy at least one clause (but not uniformly). This algorithm illustrates the general
strategy of sampling only the important space.

Consider a table with assignments on one side and the clauses $C_1, C_2, \ldots, C_m$ on the other, where each
entry is 0 or 1 depending on whether the assignment satisfies the clause. Then, for each assignment, we color
the entry for the first clause which it satisfies yellow (if such a clause exists). We color the remaining entries
for satisfied clauses blue, and we set these entries to 0. See Figure 1.

Figure 1: A table of assignments versus clauses and how to color it.

We will sample uniformly from the space of blue and yellow colored entries, and then test whether we've 
sampled a yellow entry. We then multiply the ratio we get by the total number of blue and yellow entries, 
which we can easily compute. 

Let clause $C_i$ have $k_i$ literals. Then clearly the column corresponding to $C_i$ has $2^{n-k_i}$ satisfying
assignments (which we can easily compute). We choose which clause to sample from with probability proportional
to $2^{n-k_i}$. Then we pick a random satisfying assignment for this clause and test whether it is the first satisfied
clause in its row (a yellow entry), or if there is a satisfied clause that precedes it (a blue entry). The total
size of the space we're sampling from is just $\sum_i 2^{n-k_i}$.

¹ To see this, note that the negation of a DNF formula is just a CNF formula by application of De Morgan's laws.
Therefore counting solutions to a DNF formula is equivalent to counting (non-)solutions of a CNF formula, which is the canonical
example of a #P-hard problem.

Our probability of picking a yellow entry is at least $1/m$, where $m$ is the number of clauses, so we can
take enough samples in polynomial time. Therefore, this algorithm is an FPRAS for counting the solutions 
to the DNF formula. 
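
The sampling procedure of this section can be sketched as follows. This is a minimal illustration under stated assumptions: each clause is represented as a dict mapping a 0-based variable index to the truth value its literal requires (so no clause contains a variable and its negation), and the names `satisfies` and `dnf_count_estimate` are hypothetical.

```python
import random

def satisfies(assignment, clause):
    """True if every literal of the clause holds under the assignment."""
    return all(assignment[v] == val for v, val in clause.items())

def dnf_count_estimate(n, clauses, num_samples, rng):
    # Column sizes 2^(n - k_i): the number of assignments satisfying clause i.
    weights = [2 ** (n - len(c)) for c in clauses]
    total = sum(weights)          # number of blue plus yellow entries
    yellow = 0
    for _ in range(num_samples):
        # Pick a clause with probability proportional to its column size.
        i = rng.choices(range(len(clauses)), weights=weights)[0]
        # Uniform satisfying assignment of clause i: fix its literals,
        # choose every other variable by a fair coin flip.
        assignment = [rng.random() < 0.5 for _ in range(n)]
        for v, val in clauses[i].items():
            assignment[v] = val
        # The entry is yellow iff no earlier clause is also satisfied.
        if not any(satisfies(assignment, clauses[j]) for j in range(i)):
            yellow += 1
    return total * yellow / num_samples

# Hypothetical example: F = (x0 AND x1) OR (x2) over n = 3 variables,
# which has exactly 5 satisfying assignments.
clauses = [{0: True, 1: True}, {2: True}]
estimate = dnf_count_estimate(3, clauses, 20000, random.Random(0))
```

Because at least a $1/m$ fraction of the colored entries are yellow, the success probability per sample never becomes exponentially small, unlike uniform sampling over all $2^n$ assignments.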



4 Approximating the Permanent of a 0-1 Matrix 

Definition 4 (Determinant) For a given $n \times n$ matrix $M$, the determinant is given by

$$\det(M) = \sum_{\pi \in S_n} \mathrm{sgn}(\pi) \prod_{i=1}^{n} M_{i,\pi(i)}.$$

The formula for the permanent of a matrix is largely the same, with the sgn(7r) omitted. 
Definition 5 (Permanent) For a given $n \times n$ matrix $M$, the permanent is given by

$$\mathrm{per}(M) = \sum_{\pi \in S_n} \prod_{i=1}^{n} M_{i,\pi(i)}.$$

However, while the determinant of a matrix is easily computable ($O(n^3)$ by LU decomposition),
calculating the permanent of a matrix is #P-complete. As we will show, computing the permanent of a 0-1
matrix reduces to the problem of finding the number of perfect matchings in a bipartite graph.

4.1 The Permanent of a 0-1 Matrix and Perfect Matchings 

Given an $n \times n$ 0-1 matrix $M$, we construct a subgraph $G$ of $K_{n,n}$ as follows. Let the vertices on the left be
$v_1, v_2, \ldots, v_n$ and let the vertices on the right be $w_1, w_2, \ldots, w_n$. There is an edge between $v_i$ and $w_j$ if and
only if $M_{ij}$ is 1.

Suppose $\sigma$ is a permutation of $\{1, 2, \ldots, n\}$. Then the product $\prod_i M_{i,\sigma(i)}$ is 1 if the pairing $(v_i, w_{\sigma(i)})$ is a
perfect matching, and 0 otherwise. Therefore, the permanent of $M$ equals the number of perfect matchings
in $G$. As an example, we look at a particular $3 \times 3$ matrix and the corresponding subgraph of $K_{3,3}$.

[Figure: a $3 \times 3$ 0-1 matrix and the corresponding subgraph of $K_{3,3}$.]

Calculating the permanent of a 0-1 matrix is still #P-Complete. As we will see, there is an FPRAS for 
approximating it. 
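
The correspondence between the permanent and perfect matchings can be checked by brute force on small inputs. The sketch below is illustrative only: both functions take exponential time, the function names are hypothetical, and the $3 \times 3$ matrix is an arbitrary example chosen here, not the one from the lecture's figure.

```python
from itertools import permutations

def permanent(M):
    """Permanent by direct summation over all permutations (exponential time)."""
    n = len(M)
    total = 0
    for pi in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= M[i][pi[i]]
        total += prod
    return total

def count_perfect_matchings(M):
    """Count perfect matchings of the bipartite graph with edge (v_i, w_j) iff M[i][j] == 1."""
    n = len(M)

    def extend(i, used):
        if i == n:
            return 1
        return sum(
            extend(i + 1, used | {j})
            for j in range(n)
            if M[i][j] == 1 and j not in used
        )

    return extend(0, frozenset())

# Arbitrary example matrix (an assumption of this sketch).
M = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
assert permanent(M) == count_perfect_matchings(M)
```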

4.2 An FPRAS for Approximating the Permanent of a Dense Graph



4.2.1 Some History 

• 1989: Jerrum and Sinclair showed how to approximate the permanent of a dense graph (all vertices 
have degree at least n/2). At the time, it was not known if this result could be extended to the general 
case. 

• 2001: Jerrum, Sinclair and Vigoda showed how to approximate the permanent of an arbitrary graph
(and therefore for any matrix with nonnegative entries). 

We will show today the result of 1989 for approximating the permanent of a dense graph. 

4.2.2 General Strategy 

We can't do the naive Monte Carlo here, since the probability of picking a perfect matching from the set
of all permutations can be exponentially small. Therefore we will instead consider the set of all (possibly
partial) matchings, not just perfect ones. Let $M_k$ be the set of all partial matchings of size $k$. Now suppose
that we had a black box that samples uniformly at random from $M_k \cup M_{k-1}$ for any $k$. Then by the Monte
Carlo method, by testing membership in $M_k$, we can determine the ratio $r_k = |M_k| / |M_{k-1}|$.

If we assume that for all $k$, $1/\alpha \le r_k \le \alpha$ for some polynomially-sized $\alpha$, then we can estimate each $r_k$
to within relative error $\epsilon = 1/n^2$ using polynomially many samples. Therefore our estimate of the number
of perfect matchings is just

$$|M_n| = |M_1| \prod_{i=2}^{n} r_i,$$

where $|M_1|$ is simply the number of edges of $G$.

If all of our approximations were within a $(1 \pm 1/n^2)$ factor, then our total error is at most $(1 \pm 1/n^2)^n \approx 1 \pm 1/n$.

4.2.3 Bounding the $r_k$

We first begin with a crucial lemma. 

Lemma 6 Let $G$ be a bipartite graph of minimum degree $\ge n/2$. Then every partial matching in $M_{k-1}$ has
an augmenting path of length $\le 3$.

Proof Let $m \in M_{k-1}$ be a partial matching. Let $u$ be an unmatched node of $m$. Now suppose that there
are no augmenting paths of length 1 starting from $u$ in this matching $m$ (i.e. there is no unmatched node
$v$ such that there is an edge connecting $u$ and $v$). Then by our degree conditions, $u$ must be connected to
at least $n/2$ of the matched nodes $v'_i$. Likewise if we pick an unmatched node $v$, if it has no augmenting
paths of length 1, then it must be connected to at least $n/2$ of the matched nodes $u'_j$. Since $m$ has fewer
than $n$ edges, by the pigeonhole principle there must exist $i$ and $j$ such that $(u'_j, v'_i) \in m$. The path
$(u, v'_i, u'_j, v)$ is an augmenting path of length 3. ■

Theorem 7 Let $G$ be a bipartite graph of minimum degree $\ge n/2$. Then $1/n^2 \le r_k \le n^2$ for all $k$.

Proof We first prove that $r_k \le n^2$. Consider the function $f : M_k \to M_{k-1}$, which maps $m \in M_k$ to its
(arbitrarily-chosen) canonical representative in $M_{k-1}$ (i.e. uniquely choose a submatching of $m$). For any
$m' \in M_{k-1}$, it must be the case that $|f^{-1}(m')| \le (n - k + 1)^2 \le n^2$. Thus $|M_k| \le n^2 |M_{k-1}|$.

Now we show that $1/n^2 \le r_k$. Fix some $m \in M_k$. By Lemma 6, every partial matching in $M_{k-1}$ has
an augmenting path of length $\le 3$. There are at most $k$ partial matchings in $M_{k-1}$ that can be augmented
by a path of length 1 to equal $m$. In addition, there are at most $k(k - 1)$ matchings in $M_{k-1}$ that can be
augmented by a path of length 3 to equal $m$. Thus $|M_{k-1}| \le (k + k(k - 1))|M_k| = k^2 |M_k| \le n^2 |M_k|$. ■
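
As a sanity check on Theorem 7 and on the telescoping product from the general strategy, the sketch below counts matchings of every size by brute force on a small dense graph. The function name `matchings_by_size` and the example graph (left vertex $i$ joined to right vertices $i$ and $i+1 \bmod 4$, so every degree is exactly $n/2 = 2$) are assumptions made for illustration; exact counting like this is only feasible for tiny $n$.

```python
from fractions import Fraction

def matchings_by_size(n, edges):
    """counts[k] = number of matchings with exactly k edges.
    Vertices on each side are 0..n-1; edges is a set of (left, right) pairs."""
    adj = [[j for j in range(n) if (i, j) in edges] for i in range(n)]
    counts = [0] * (n + 1)

    def extend(i, used, size):
        if i == n:
            counts[size] += 1
            return
        extend(i + 1, used, size)            # leave left vertex i unmatched
        for j in adj[i]:
            if j not in used:
                extend(i + 1, used | {j}, size + 1)

    extend(0, frozenset(), 0)
    return counts

n = 4
edges = {(i, i) for i in range(n)} | {(i, (i + 1) % n) for i in range(n)}
counts = matchings_by_size(n, edges)

# Ratios r_k = |M_k| / |M_{k-1}| and the bounds of Theorem 7.
ratios = {k: Fraction(counts[k], counts[k - 1]) for k in range(1, n + 1)}
assert all(Fraction(1, n * n) <= r <= n * n for r in ratios.values())

# Telescoping product |M_n| = |M_1| * r_2 * ... * r_n.
product = Fraction(counts[1])
for k in range(2, n + 1):
    product *= ratios[k]
assert product == counts[n]
```

In the actual algorithm the ratios are not computed exactly but estimated by sampling from $M_k \cup M_{k-1}$; the bounds above are what keep each estimate cheap.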

4.2.4 How to Sample (Approximately) Uniformly

We still have to show how to sample uniformly from $C_k = M_k \cup M_{k-1}$. We will only show how to sample
approximately uniformly from this set. As it turns out, this result is good enough for our purposes.

The main idea here is to construct a graph whose vertex set is $C_k$, and then do a random walk on this
graph which converges to the uniform distribution. We have to show two things: that the random walk
converges in polynomial time, and that the stationary distribution on the graph $C_k$ is uniform. To show
that the random walk mixes quickly, we bound the conductance $\Phi(C_k)$ by the method of canonical paths.

Lemma 8 Let $G = (V, E)$ be a graph for which we wish to bound $\Phi(G)$. For every $v, w \in V$, we specify a
canonical path $p_{v,w}$ from $v$ to $w$. Suppose that for some constant $b$ and for all $e \in E$, we have

$$|\{(v, w) : e \in p_{v,w}\}| \le b|V|,$$

that is, at most $b|V|$ of the canonical paths run through any given edge $e$. Then $\Phi(G) \ge \frac{1}{2 b\, d_{\max}}$, where $d_{\max}$
is the maximum degree of any vertex.

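Proof As before, the conductance of $G$ is defined as

$$\Phi(G) = \min_{S \subset V} \frac{e(S)}{\min\left\{\sum_{v \in S} d(v), \sum_{v \in \bar{S}} d(v)\right\}}.$$

Let $S \subset V$. We will show that $\Phi(S) \ge \frac{1}{2 b\, d_{\max}}$. Without loss of generality, assume that $|S| \le |V|/2$. Then
the number of canonical paths across the cut is at least $|S||\bar{S}| \ge |S||V|/2$. Since there
can be no more than $b|V|$ paths through each edge, the number of edges $e(S)$ across the cut is at least

$$\frac{|S||V|/2}{b|V|} = \frac{|S|}{2b}.$$

In addition we can bound $\min\left\{\sum_{v \in S} d(v), \sum_{v \in \bar{S}} d(v)\right\}$ by $|S| d_{\max}$. These bounds give us

$$\Phi(S) \ge \frac{|S|/(2b)}{|S|\, d_{\max}} = \frac{1}{2 b\, d_{\max}},$$

as claimed. ■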

Since the spectral gap is at least $\Phi(G)^2$, as long as $b$ and $d_{\max}$ are bounded by polynomials, a random
walk on $G$ will converge in polynomial time.

4.2.5 The Graph Ck 

We will only do $C_n$. It should be clear later how to extend this construction for all $k$. Recall that our
vertices correspond to matchings in $M_n \cup M_{n-1}$. We show how to connect our vertices with 4 different types
of directed edges:

• Reduce ($M_n \to M_{n-1}$): If $m \in M_n$, then for all $e \in m$ define a transition to $m' = m - e \in M_{n-1}$.

• Augment ($M_{n-1} \to M_n$): If $m \in M_{n-1}$, then for all $u$ and $v$ unmatched with $(u, v) \in E$, define a
transition to $m' = m + (u, v) \in M_n$.

• Rotate ($M_{n-1} \to M_{n-1}$): If $m \in M_{n-1}$, then for all $(u, w) \in m$, $(u, v) \in E$ with $v$ unmatched, define
a transition to $m' = m + (u, v) - (u, w) \in M_{n-1}$.

• Self-Loop: Add enough self-loops so that you remain where you are with probability 1/2 (this gives
us a uniform stationary distribution).

Note that this actually provides an undirected graph since each of these steps is reversible. 
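
The construction above can be made concrete for small graphs. The sketch below enumerates the states of $C_n$, builds the Reduce, Augment, and Rotate neighbors, and assembles a transition matrix. One assumption is made loudly here: the self-loop rule is realized by giving every neighbor probability $1/(2 d_{\max})$ and leaving the remaining mass on the current state, which is one common way to obtain a symmetric chain (and hence a uniform stationary distribution); the lecture does not spell out this detail. All function names are hypothetical.

```python
from fractions import Fraction

def chain_states(n, edges):
    """All matchings with n or n-1 edges, as frozensets of (left, right) pairs."""
    found = set()

    def extend(i, used, chosen):
        if i == n:
            if len(chosen) >= n - 1:
                found.add(frozenset(chosen))
            return
        extend(i + 1, used, chosen)              # leave left vertex i unmatched
        for j in range(n):
            if (i, j) in edges and j not in used:
                extend(i + 1, used | {j}, chosen + [(i, j)])

    extend(0, frozenset(), [])
    return found

def neighbors(m, n, edges):
    out = set()
    if len(m) == n:
        for e in m:                              # Reduce
            out.add(m - {e})
        return out
    left_free = set(range(n)) - {u for u, _ in m}
    right_free = set(range(n)) - {w for _, w in m}
    for u in left_free:                          # Augment
        for v in right_free:
            if (u, v) in edges:
                out.add(m | {(u, v)})
    for (u, w) in m:                             # Rotate
        for v in right_free:                     # pivot on the left endpoint u
            if (u, v) in edges:
                out.add((m - {(u, w)}) | {(u, v)})
        for x in left_free:                      # pivot on the right endpoint w
            if (x, w) in edges:
                out.add((m - {(u, w)}) | {(x, w)})
    return out

def transition_matrix(n, edges):
    states = sorted(chain_states(n, edges), key=sorted)
    nbrs = {m: neighbors(m, n, edges) for m in states}
    d_max = max(len(v) for v in nbrs.values())
    P = {}
    for m in states:
        row = {m2: Fraction(1, 2 * d_max) for m2 in nbrs[m]}
        row[m] = 1 - sum(row.values())           # self-loop mass, at least 1/2
        P[m] = row
    return states, P

# C_2 for G = K_{2,2}: two perfect matchings plus four single-edge matchings.
edges = {(i, j) for i in range(2) for j in range(2)}
states, P = transition_matrix(2, edges)
```

Since every move is reversible and each neighbor receives the same probability, the matrix `P` is symmetric, so the uniform distribution is stationary.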

Example 9 In Figure 2 we show $C_2$ for the graph $G = K_{2,2}$. The two leftmost and two rightmost edges are
Augment/Reduce pairs, while the others are Rotate transitions. The self-loops are omitted.

Figure by MIT OpenCourseWare.

Figure 2: $C_2$ for the graph $G = K_{2,2}$.

4.2.6 Canonical Paths 

We still need to define the canonical paths $p_{s,t}$ for our graph $C_n$. For each node $s \in M_n \cup M_{n-1}$, we associate
with it a "partner" $s' \in M_n$, as follows:

• If $s \in M_n$, $s' = s$.

• If $s \in M_{n-1}$ and has an augmenting path of length 1, augment to get $s'$.

• If $s \in M_{n-1}$ and has a shortest augmenting path of length 3, augment to get $s'$.

Now for nodes $s, t \in M_n \cup M_{n-1}$, we show how to provide a canonical path $p_{s,t}$ which consists of three
segments (and each segment can be one of two different types).

• $s \to s'$ (Type A)

• $s' \to t'$ (Type B)

• $t' \to t$ (Type A)

Type A paths are paths that connect a vertex $s \in M_n \cup M_{n-1}$ to its partner $s' \in M_n$. Clearly, if $s \in M_n$
then the type A path is empty. Now if $s \in M_{n-1}$ and has an augmenting path of length 1, then our canonical
path is simply the edge that performs the Augment operation. If $s \in M_{n-1}$ and has a shortest augmenting
path of length 3, then our canonical path is of length 2: first a Rotate, then an Augment (see Figure 3 for
an example).

For a Type B path, both $s'$ and $t'$ are in $M_n$. We let $d = s' \oplus t'$, the symmetric difference of the two
matchings (those edges which are not common to both matchings). It is clear that since $s'$ and $t'$ are perfect
matchings, $d$ consists of a collection of disjoint, even-length, alternating (from $s'$ or from $t'$) cycles of length
at least 4.

Our canonical path from s' to t' will in a sense "unwind" each cycle of d individually. Now, in order 
for the path to be canonical, we need to provide some ordering on the cycles so that we process them in 
the same order each time. However, this can be done easily enough. In addition, we need to provide some 
ordering on the vertices in each cycle so that we unwind each cycle in the same order each time. Again, this 
can be done easily enough. All that remains is to describe how the cycles are unwound, which can be done 
much more effectively with a picture than by text. See Figures 4 and 5. 
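
The cycle structure of the symmetric difference, together with one possible fixed ordering, can be sketched as below. The ordering used here (cycles sorted by their smallest left vertex, each traversed starting from that vertex along its edge in the first matching) is an assumption chosen for illustration; any fixed rule would serve. Perfect matchings are represented as dicts from left vertex to right vertex, and the function name is hypothetical. The sketch only lists the cycles; it does not perform the unwinding itself.

```python
def symmetric_difference_cycles(s, t):
    """Decompose s XOR t into alternating cycles.
    s and t map each left vertex to its matched right vertex.
    Each cycle is a list of (source, left, right) edges alternating 's', 't'."""
    t_inv = {w: u for u, w in t.items()}     # right vertex -> left partner in t
    visited = set()
    cycles = []
    for start in sorted(s):
        if start in visited or s[start] == t[start]:
            continue                          # shared edge, not part of any cycle
        cycle, u = [], start
        while True:
            visited.add(u)
            w = s[u]
            cycle.append(("s", u, w))         # edge of s leaving left vertex u
            u = t_inv[w]
            cycle.append(("t", u, w))         # edge of t returning to the left side
            if u == start:
                break
        cycles.append(cycle)
    return cycles

# Hypothetical example on n = 4: two disjoint alternating 4-cycles.
s = {0: 0, 1: 1, 2: 2, 3: 3}
t = {0: 1, 1: 0, 2: 3, 3: 2}
cycles = symmetric_difference_cycles(s, t)
```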

We must now bound the number of canonical paths that pass through each edge. First we consider the 
type A paths. 

Figure 3: Type A path of length 2.



Lemma 10 Let $s \in M_n$. Then at most $O(n^2)$ other nodes $s' \in M_n \cup M_{n-1}$ have $s$ as their partner.

Proof There are three possible types of nodes $s'$ that have $s$ as their partner. The first is if $s' = s$
(hence the type A path is empty). The second can be obtained by a Reduce transition (the nodes $s'$ with
augmenting path of length 1). The third can be obtained by a Reduce and Rotate pair of transitions.
There is only one node of the first type, $O(n)$ of the second, and $O(n^2)$ of the third. Therefore there are at most
$O(n^2)$ nodes $s'$ that can count $s$ as their partner. ■

Now we wish to count the number of canonical paths for type B. 

Lemma 11 Let $T$ be a transition (i.e. an edge of $C_n$). Then the number of pairs $s, t \in M_n$ that contain $T$
on their type B canonical path is bounded by $|C_n|$.

Proof We will provide an injection $\sigma_T(s, t)$ that maps to matchings in $C_n = M_n \cup M_{n-1}$. As before, let
$d = s \oplus t$ be the symmetric difference of the two matchings $s$ and $t$ (recall that this can be broken down into
disjoint alternating cycles $C_1, \ldots, C_r$). Now we proceed along the unwinding of these cycles until we reach
the transition $T$. At this point we stop and say that the particular matching we are at, where all cycles up
to this point agree with $s$ and all cycles after this point agree with $t$, is the matching that $\sigma_T(s, t)$ maps to.

It is clear that this is fine when $T$ is a Reduce or Augment transition, since these only occur at the
beginning or end of an unwinding. The only problem is when $T$ is a Rotate transition, because then there
exists a vertex $u$ (the pivot of the rotation) that is matched to a vertex $v$ with $(u, v) \in s$ and is also matched
to a vertex $w$ with $(u, w) \in t$. This is because up to $T$ we agree with $s$, and after $T$ we agree with $t$. But
what we can do at this point is notice that one of these two edges (which we denote by $e_{s,t}$) always has
the start vertex of the current cycle as one of its end-points. Therefore by removing it we end up with a
matching again. This is further illustrated in Figure 6. ■

Theorem 12 The conductance of our graph has the following bound:

$$\Phi(C_n) = \Omega\left(\frac{1}{n^6}\right).$$

Figure 4: Unwinding a single cycle (type B path).

Proof By Lemma 8, we have $\Phi(G) \ge \frac{1}{2 b\, d_{\max}}$. As shown in Lemma 10, there are at most $O(n^2)$ canonical
paths to $s \in M_n$ from $M_{n-1}$, and at most $O(n^2)$ canonical paths from $t \in M_n$ to $M_{n-1}$. In addition we
showed in Lemma 11 that the number of type B paths through a particular transition $T$ is bounded by
$|M_n \cup M_{n-1}| = |V|$ (where $V$ is the vertex set of $C_n$). Therefore as a whole, the number of canonical paths
through a particular transition $T$ is bounded by $n^2 \times |V| \times n^2$, which implies $b = n^4$.

Since $d_{\max} = O(n^2)$, the conductance is bounded from below by $\Omega\left(\frac{1}{n^6}\right)$ and our random walk mixes in
polynomial time. ■

Figure 5: Unwinding a collection of cycles (type B path).

Figure 6: The encoding $\sigma_T(s, t)$.